Active Learning Entropy Sampling based Clustering Optimization Method for Electricity Data


Abstract

Clustering is a crucial part of data mining. Common clustering methods include division-based, hierarchy-based, density-based, and grid-based methods. To improve clustering accuracy, this study optimizes the division-based method FCM and proposes a clustering optimization method that integrates active learning and principal component analysis (PCA). The method first uses PCA to reduce the dimensionality and computation of electricity data, then trains the sample model by active learning and introduces an entropy-based uncertainty sampling method: larger entropy means greater uncertainty of a sample, and smaller entropy means smaller uncertainty. The samples are filtered on this basis and finally clustered. With the proliferation of power data, the data can be categorized more accurately using this method, supporting the stability of the power grid as well as its utilization rate. Experimental results on three datasets show that the method improves accuracy by up to 2 percentage points compared with traditional FCM without active learning, and achieves good results on each dataset compared with other methods.
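The entropy-based uncertainty sampling step described in the abstract can be illustrated with a minimal sketch. It assumes that a trained model outputs a class-probability vector per sample and that uncertainty is measured with Shannon entropy. The function names, the example probabilities, and the choice to select the most uncertain samples first are hypothetical and are not taken from the paper.

```python
import math


def shannon_entropy(probs):
    """Shannon entropy (in nats) of one sample's class-probability vector."""
    # Zero-probability classes contribute nothing, so they are skipped.
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def rank_by_uncertainty(prob_rows):
    """Return sample indices sorted from largest to smallest entropy."""
    entropies = [shannon_entropy(row) for row in prob_rows]
    return sorted(range(len(prob_rows)), key=lambda i: entropies[i], reverse=True)


def select_most_uncertain(prob_rows, k):
    """Pick the k highest-entropy samples (an assumed filtering rule)."""
    return rank_by_uncertainty(prob_rows)[:k]


# Hypothetical model outputs for four samples over three classes.
probs = [
    [0.98, 0.01, 0.01],  # confident prediction: low entropy
    [0.34, 0.33, 0.33],  # nearly uniform: high entropy
    [0.70, 0.20, 0.10],
    [0.50, 0.50, 0.00],
]
order = rank_by_uncertainty(probs)
picked = select_most_uncertain(probs, 2)
```

In this sketch the nearly uniform sample ranks first and the confident one last, matching the statement that larger entropy means greater uncertainty. Whether the high-entropy or the low-entropy samples are kept before clustering is a design choice the abstract does not spell out.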


Similar resources

Entropy-based Consensus for Distributed Data Clustering

The increasingly larger scale of available data and the more restrictive concerns on their privacy are some of the challenging aspects of data mining today. In this paper, Entropy-based Consensus on Cluster Centers (EC3) is introduced for clustering in distributed systems with a consideration for confidentiality of data; i.e. it is the negotiations among local cluster centers that are used in t...


Clustering Based Active Learning for Evolving Data Streams

Data labeling is an expensive and time-consuming task. Choosing which labels to use is increasingly becoming important. In the active learning setting, a classifier is trained by asking for labels for only a small fraction of all instances. While many works exist that deal with this issue in non-streaming scenarios, few works exist in the data stream setting. In this paper we propose a new acti...


The Learning-Curve Sampling Method Applied to Model-Based Clustering

We examine the learning-curve sampling method, an approach for applying machine-learning algorithms to large data sets. The approach is based on the observation that the computational cost of learning a model increases as a function of the sample size of the training data, whereas the accuracy of a model has diminishing improvements as a function of sample size. Thus, the learning-curve sampling...


Clustering-Based Active Learning

In contexts where obtaining labels for data points is expensive, active learning is a widely used methodology for deciding which data points to get a label for. In this project, we propose and evaluate an active learning algorithm in the context of an automatic grading and feedback generation tool for an online embedded systems course currently under development at UC Berkeley. This tool learns a f...



Journal

Journal title: International Journal of Database Management Systems

Year: 2022

ISSN: 0975-5705, 0975-5985

DOI: https://doi.org/10.5121/ijdms.2022.14601